
170 research outputs found

antGLasso: An Efficient Tensor Graphical Lasso Algorithm

Author: Andrew Bailey
Cutillo Luisa
Westhead David
Publication date: 05/11/2022

The class of bigraphical lasso algorithms (and, more broadly, 'tensor'-graphical lasso algorithms) has been used to estimate dependency structures within matrix and tensor data. However, all current methods to do so take prohibitively long on modestly sized datasets. We present a novel tensor-graphical lasso algorithm that analytically estimates the dependency structure, unlike its iterative predecessors. This provides a speedup of multiple orders of magnitude, allowing this class of algorithms to be used on large, real-world datasets. Comment: 9 pages (21 including supplementary material), 8 figures, submitted to the GLFrontiers workshop at NeurIPS 202

arXiv.org e-Print Archive

A primer on learning in Bayesian networks for computational biology

Author: Andrew J Bulpitt
Chris J Needham
David R Westhead
Fran Lewitter
James R Bradford
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2007

Crossref

Directory of Open Access Journals

PubMed Central

White Rose Research Online

TMB-Hunt: a web server to screen sequence sets for transmembrane β-barrel proteins

Author: Agnew Alison
Garrow Andrew G.
Westhead David R.
Publication venue: Oxford University Press
Publication date: 27/06/2005

TMB-Hunt is a program that uses a modified k-nearest neighbour (k-NN) algorithm to classify protein sequences as transmembrane β-barrel (TMB) or non-TMB on the basis of whole sequence amino acid composition. By including differentially weighted amino acids and evolutionary information, and by calibrating the scoring, a discrimination accuracy of 92.5% was achieved, as tested using a rigorous cross-validation procedure. The TMB-Hunt web server allows screening of up to 10 000 sequences in a single query and provides results and key statistics in a simple colour-coded format.

Crossref

PubMed Central

TMB-Hunt: An amino acid composition based method to screen proteomes for beta-barrel transmembrane proteins

Author: Agnew Alison
Garrow Andrew G
Westhead David R
Publication venue: BioMed Central
Publication date: 01/01/2005

Background: Beta-barrel transmembrane (bbtm) proteins are a functionally important and diverse group of proteins expressed in the outer membranes of bacteria (both gram negative and acid fast gram positive), mitochondria and chloroplasts. Despite recent publications describing reasonable levels of accuracy for discriminating between bbtm proteins and other proteins, screening of entire genomes remains troublesome as these molecules only constitute a small fraction of the sequences screened. Therefore, novel methods capable of detecting new families of bbtm protein in diverse genomes are still required. Results: We present TMB-Hunt, a program that uses a k-Nearest Neighbour (k-NN) algorithm to discriminate between bbtm and non-bbtm proteins on the basis of their amino acid composition. By including differentially weighted amino acids and evolutionary information, and by calibrating the scoring, an accuracy of 92.5% was achieved, with 91% sensitivity and 93.8% positive predictive value (PPV), using a rigorous cross-validation procedure. A major advantage of this approach is that, because it does not rely on beta-strand detection, it does not require resolved structures and thus larger, more representative training sets could be used. It is therefore believed that this approach will be invaluable in complementing other physicochemical and homology-based methods. This was demonstrated by the correct reassignment of a number of proteins which other predictors failed to classify. We have used the algorithm to screen several genomes and have discussed our findings. Conclusion: TMB-Hunt achieves a prediction accuracy level better than other approaches published to date. Results were significantly enhanced by use of evolutionary information and a system for calibrating k-NN scoring. Because the program uses a distinct approach to that of other discriminators and thus suffers different liabilities, we believe it will make a significant contribution to the development of a consensus approach for bbtm protein detection.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The transferome of metabolic genes explored: analysis of the horizontal transfer of enzyme encoding genes in unicellular eukaryotes

Author: McConkey Glenn A
Westhead David R
Whitaker John W
Publication venue: BioMed Central
Publication date: 01/01/2009

Metabolic network analysis in multiple eukaryotes identifies how horizontal and endosymbiotic gene transfer of metabolic enzyme-encoding genes leads to functional gene gain during evolution

Crossref

Springer - Publisher Connector

PubMed Central

metaSHARK: a WWW platform for interactive exploration of metabolic networks

Author: Hyland Christopher
McConkey Glenn A.
Pinney John W.
Westhead David R.
Publication venue: Oxford University Press
Publication date: 01/01/2006

The metaSHARK (metabolic search and reconstruction kit) web server offers users an intuitive, fully interactive way to explore the KEGG metabolic network via a WWW browser. Metabolic reconstruction information for specific organisms, produced by our automated SHARKhunt tool or from other programs or genome annotations, may be uploaded to the website and overlaid on the generic network. Additional data from gene expression experiments can also be incorporated, allowing the visualization of differential gene expression in the context of the predicted metabolic network. metaSHARK is available at

CiteSeerX

Crossref

PubMed Central

A novel method for comparing topological models of protein structures enhanced with ligand information

Author: Mallika Veeramalai
David Gilbert
Publication venue: 'Oxford University Press (OUP)'
Publication date: 07/10/2008

Copyright © 2008 The Authors. We introduce TOPS+ strings, a highly abstract string-based model of protein topology that permits efficient computation of structure comparison, and can optionally represent ligand information. In this model, we consider loops as secondary structure elements (SSEs) as well as helices and strands; in addition, we represent ligands as first-class objects. Interactions between SSEs and between SSEs and ligands are described by incoming/outgoing arcs and ligand arcs, respectively, and SSEs are annotated with arc interaction direction and type. We are able to abstract away from the ligands themselves, to give a model characterized by a regular grammar rather than the context-sensitive grammar of the original TOPS model. Our TOPS+ strings model is sufficiently descriptive to obtain biologically meaningful results and has the advantage of permitting fast string-based structure matching and comparison, as well as avoiding issues of Non-deterministic Polynomial time (NP)-completeness associated with graph problems. Our structure comparison method is computationally more efficient in identifying distantly related proteins than BLAST, CLUSTALW, SSAP and TOPS because of the compact and abstract string-based representation of protein structure, which records both topological and biochemical information, including the functionally important loop regions of the protein structures. The accuracy of our comparison method is comparable with that of TOPS. We have also demonstrated that our TOPS+ strings method outperforms the TOPS method for ligand-dependent protein structures and provides biologically meaningful results. Availability: The TOPS+ strings comparison server is available from http://balabio.dcs.gla.ac.uk/mallika/WebTOPS/topsplus.html. University of Glasgo

Crossref

Brunel University Research Archive

Bayesian refinement of protein functional site matching

Author: Gold Nicola D
Green Peter J
Mardia Kanti V
Nyirongo Vysaul B
Westhead David R
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Matching functional sites is a key problem for the understanding of protein function and evolution. The commonly used graph theoretic approach, and other related approaches, require adjustment of a matching distance threshold a priori according to the noise in atomic positions. This is difficult to pre-determine when matching sites related by varying evolutionary distances and crystallographic precision. Furthermore, sometimes the graph method is unable to identify alternative but important solutions in the neighbourhood of the distance-based solution because of strict distance constraints. We consider the Bayesian approach to improve graph-based solutions. In principle this approach applies to other methods with strict distance matching constraints. The Bayesian method can flexibly incorporate all types of prior information on specific binding sites (e.g. amino acid types), in contrast to combinatorial formulations. Results: We present a new meta-algorithm for matching protein functional sites (active sites and ligand binding sites) based on an initial graph matching followed by refinement using a Markov chain Monte Carlo (MCMC) procedure. This procedure is an innovative extension to our recent work. The method accounts for the 3-dimensional structure of the site as well as the physico-chemical properties of the constituent amino acids. The MCMC procedure can lead to a significant increase in the number of significant matches compared to the graph method, as measured independently by rigorously derived p-values. Conclusion: The MCMC refinement step is able to significantly improve graph-based matches. We apply the method to matching NAD(P)(H) binding sites within single Rossmann fold families, between different families in the same superfamily, and in different folds. Within families, sites are often well conserved, but there are examples where significant shape-based matches do not retain similar amino acid chemistry, indicating that even within families the same ligand may be bound using substantially different physico-chemistry. We also show that the procedure finds significant matches between binding sites for the same co-factor in different families and different folds.

Crossref

Directory of Open Access Journals

OPUS - University of Technology Sydney

PubMed Central

Integrated analyses of chromatin accessibility and gene expression data for elucidating the transcriptional regulatory mechanisms during early hematopoietic development in mouse

Author: Bonifer Constanze
Gottgens Bertie
Hoogenkamp Maarten
Kouskoff Valerie
Lacaud Georges
Lichtinger Monika
Obier Nadine
Pearson Stella
Vijayabaskar Mahalingam S
Westhead David R
Publication venue: Epigenetics & chromatin
Publication date: 01/01/2013

Rights: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license, which is similar to the 'Creative Commons Attribution Licence'. In brief, you may copy, distribute, and display the work; make derivative works; or make commercial use of the work, under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

Springer - Publisher Connector

PubMed Central

The University of Manchester - Institutional Repository

Apollo (Cambridge)

Arabidopsis Coexpression Tool: a tool for gene coexpression analysis in Arabidopsis thaliana

Author: Angelopoulou Antonia
Daras Gerasimos
Duddy William
Georgia Saxami
Hatzopoulos Polydefkis
Jen Chih-Hung
Malatras Apostolos
Michalopoulos Ioannis
Westhead David
Zogopoulos Vasileios
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

Gene coexpression analysis refers to the discovery of sets of genes which exhibit similar expression patterns across multiple transcriptomic data sets, such as microarray experiment data of public repositories. Arabidopsis Coexpression Tool (ACT), a gene coexpression analysis web tool for Arabidopsis thaliana, identifies genes which are correlated to a driver gene. Primary microarray data from ATH1 Affymetrix platform were processed with Single-Channel Array Normalization algorithm and combined to produce a coexpression tree which contains ∼21,000 A. thaliana genes. ACT was developed to present subclades of coexpressed genes, as well as to perform gene set enrichment analysis, being unique in revealing enriched transcription factors targeting coexpressed genes. ACT offers a simple and user-friendly interface producing working hypotheses which can be experimentally verified for the discovery of gene partnership, pathway membership, and transcriptional regulation. ACT analyses have been successful in identifying not only genes with coordinated ubiquitous expressions but also genes with tissue-specific expressions

Directory of Open Access Journals

HAL-Inserm

PubMed Central

Ulster University's Research Portal

HAL-CEA

White Rose Research Online